ABSTRACT 

As Arabidopsis thaliana has colonized a wide range of habitats across the world it is an attractive 
model for studying the genetic mechanisms underlying environmental adaptation. Here, we used 
public data from two collections of A. thaliana accessions to associate genetic variability at 
individual loci with differences in climates at the sampling sites. We used a novel method to screen 
the genome for plastic alleles that tolerate a broader climate range than the major allele. This 
approach reduces confounding with population structure and increases power compared to standard 
genome-wide association methods. Sixteen novel loci were found, including an association between 
Chromomethylase 2 (CMT2) and variability in seasonal temperatures where the plastic allele had 
reduced genome-wide CHH methylation. Plants carrying cmt2 mutations were more tolerant to heat-stress, 
suggesting genetic regulation of epigenetic modifications as a likely mechanism underlying natural 
adaptation to variable temperatures, potentially through differential allelic plasticity to 
temperature-stress. 



AUTHOR SUMMARY 

A central problem when studying adaptation to a new environment is the interplay between genetic 
variation and phenotypic plasticity. Arabidopsis thaliana has colonized a wide range of habitats 
across the world and it is therefore an attractive model for studying the genetic mechanisms 
underlying environmental adaptation. Here, we used publicly available data from two collections of 
A. thaliana accessions, covering the native range of the species, to identify loci associated with 
differences in climates at the sampling sites. To address the confounding between geographic 
location, climate and population structure, a new genome-wide association analysis method was 
developed that facilitates detection of potentially adaptive loci where the alternative alleles display 
different tolerable climate ranges. Sixteen novel such loci, many of which contained candidate 
genes with amino acid changes, were found including a strong association between 
Chromomethylase 2 (CMT2) and variability in seasonal temperatures. The reference allele 
dominated in areas with less seasonal variability in temperature, and the alternative allele, which 
disrupts genome-wide CHH-methylation, existed in both stable and variable regions. We have 
experimentally shown that plants with a defective CMT2 gene tolerate heat-stress better than plants 
with a functional gene. Our results thus link natural variation in CMT2, and differential genome- 
wide CHH methylation, to the distribution of A. thaliana accessions across habitats with different 
seasonal temperature variability and show that disruption of CMT2 function improves heat-stress 
tolerance. The results therefore also suggest a role for genetic regulation of epigenetic modifications 
in natural adaptation to temperature, potentially through differential allelic plasticity, and illustrate 
the importance of re-analyses of existing data using new analytical methods to obtain a more 
complete understanding of the mechanisms contributing to adaptation. 



BLURB 

Disrupted Chromomethylase 2 (CMT2) is associated with plasticity to seasonal temperature 
variability, decreased CHH-methylation and improved heat-stress tolerance, suggesting a role for 
genetic regulation of epigenetic modifications in natural adaptation. 



INTRODUCTION 



Arabidopsis thaliana has colonized a wide range of habitats across the world and it is therefore an 
attractive model for studying the genetic mechanisms underlying environmental adaptation [1]. 
Several large collections of A. thaliana accessions have either been whole-genome re-sequenced or 
high-density SNP genotyped [1-7]. The included accessions have adapted to a wide range of 
different climatic conditions and therefore loci involved in climate adaptation will display genotype 
by climate-at-sampling-site correlations in these populations. Genome-wide association or 
selective-sweep analyses can therefore potentially identify signals of natural selection involved in 
environmental adaptation, if those can be disentangled from the effects of other population genetic 
forces acting to change the allele frequencies. Selective-sweep studies are inherently sensitive to 
population-structure and, if present, the false-positive rates will be high as the available statistical 
methods are unable to handle this situation properly. Further experimental validation of inferred 
sweeps (e.g. [1,8]) is hence necessary to suggest them as adaptive. In GWAS, kinship correction is 
now a standard approach to account for population structure that properly controls the false 
discovery rate. Unfortunately, correcting for genomic kinship often decreases the power to detect 
individual adaptive loci, which is likely the reason that no genome-wide significant associations to 
climate conditions were found in earlier GWAS analyses [1,8]. Nevertheless, a number of candidate 
adaptive loci could still be identified using extensive experimental validation [1,2,8], 
showing how valuable these populations are as a resource for finding the genomic footprint of 
climate adaptation. 

RESULTS 

Genome-wide association analysis to detect loci with plastic response to climate 

Here, we re-analyze the data from the RegMap collection to find loci contributing to climate 
adaptation through an alternative mechanism: genetic control of plasticity. Such loci are unlikely to 
be detected with standard GWAS or selective-sweep analyses as they have a different genomic 
signature of selection and distribution across climate envelopes [9]. We extend and utilize an 
approach [10,11] that, instead of mapping loci by differences in allele-frequencies between local 
environments, which is highly confounded by population structure, infers adaptive loci using a 
heterogeneity-of-variance test. This identifies loci where the minor allele tolerates a broader range of 
climate conditions than the major allele [10]. As such plastic alleles will be present across the entire 
population, they are less confounded with population structure and detectable in our GWAS 
analysis that utilizes kinship correction to account for population stratification. A genome-wide 
association analysis was performed for thirteen climate variables across ~215,000 SNPs in 948 A. 
thaliana accessions from the RegMap collection, representing the native range of the species [1,12]. 
In total, sixteen genome-wide significant loci were associated with eight climate variables (Table 1), 
none of which could be found using standard methods for GWAS analyses [1,8,13-15]. The effects 
were in general quite large, from 0.3 to 0.5 residual standard deviations (Table 1), corresponding to 
increases in the climate plasticity of 21-35% from the alternate allele. The detailed results for each 
trait are reported in Text S1 (Supplementary Figure 1-13). The distributions of the significant plastic 
alleles across the population strata in relation to their geographic origin and the climate envelopes 
are provided in Text S1 (Supplementary Figure 14-35). 

Fine-mapping and identification of candidate mutations in the 1001-genomes data 

Utilizing data from the 1001-genomes project [2-7] (http://1001genomes.org), we fine-mapped the 
significant loci and identified five functional candidate genes (Table 1) and 11 less well 
characterized genes (Table S1) with either missense, nonsense or frameshift mutations in high 
linkage disequilibrium (LD; r² > 0.8) with the leading SNP of each locus. A further 76 linked loci 
or genes without candidate mutations in the coding regions are reported in Table S2. 

Chromomethylase 2 (CMT2) is associated with seasonal temperature variability in the 
RegMap collection 

A strong association to temperature seasonality, i.e. the ratio between the standard deviation and the 
mean of temperature records over a year, was identified near Chromomethylase 2 (CMT2; Table 1; 
Figure 1). Stable areas are generally found near large bodies of water (e.g. London near the Atlantic 
11 ± 5°C; mean ± SD) and variable areas inland (e.g. Novosibirsk in Siberia 1 ± 14°C). A premature 
CMT2 stop codon located on chromosome 4 at 10 414 556 bp segregated in the RegMap collection. 
This CMT2stop allele had a genome-wide significant association to temperature seasonality (P = 
1.1 × 10⁻⁷) and was in strong LD (r² = 0.82) with the leading SNP (Figure 1B). The geographic 
distributions of the wild-type (CMT2wt) and the alternative (CMT2stop) alleles in the RegMap 
collection show that the CMT2wt allele dominates in all major sub-populations sampled from areas 
with low or intermediate temperature seasonality. The plastic CMT2stop allele is present, albeit at 
lower frequency, across all sub-populations in low- and intermediate temperature seasonality areas, 
and is more common in areas with high temperature seasonality (Figure 2A; Figure 3; Text S1, 
Supplementary Figure 36). Such global distribution across the major population strata indicates that 
the allele has been around in the Eurasian population sufficiently long to spread across most of the 
native range and that lack of functional CMT2 is not deleterious but rather maintained through 
balancing selection [9], perhaps through an improved tolerance to variable temperatures. 
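
As a worked illustration of the seasonality measure defined above, the following Python sketch computes the ratio for the two example cities. It assumes that the mean is expressed in Kelvin, as the caption of Figure 2 indicates; the function name is hypothetical and the sketch is not part of the original analysis, which was carried out in R.

```python
def temperature_seasonality(mean_celsius, sd_celsius):
    """Standard deviation of temperature as a percentage of the mean in Kelvin."""
    mean_kelvin = mean_celsius + 273.15
    return 100.0 * sd_celsius / mean_kelvin


# Example values quoted in the text (mean ± SD in degrees Celsius).
london = temperature_seasonality(11.0, 5.0)
novosibirsk = temperature_seasonality(1.0, 14.0)
print(f"London: {london:.2f}%  Novosibirsk: {novosibirsk:.2f}%")
```

Under this assumption the inland site has roughly three times the seasonality of the coastal site, which is the contrast the association analysis exploits.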

Broader geographic distribution of the CMT2stop allele in the 1001-genomes collection 

To confirm that the CMT2stop association was not due to sampling bias in the RegMap, we also 
scored the CMT2 genotype and collected the geographical origins from 665 accessions in the 1001- 
genomes project (http://1001genomes.org) [2,3,5-7]. In this more geographically diverse set (Figure 
2A), CMT2stop was more common (MAF = 0.10) and had a similar allele distribution across 
Eurasia as in RegMap (Text S1, Supplementary Figure 36-37). Two additional mutations were 
identified on unique haplotypes (r² = 0.00): one nonsense mutation, CMT2stop2, at 10 416 213 bp (MAF = 
0.02) and a frameshift mutation at 10 414 640 bp (two accessions). Both CMT2stop and CMT2stop2 
had genotype-phenotype maps implying a plastic response to variable temperature (Figure 2B) and 
the existence of multiple mutations disrupting CMT2 further suggests lack of CMT2 function as a 
potentially evolutionarily beneficial event. 

Accessions with the CMT2stop allele have an altered genome-wide CHH-methylation pattern 

CMT2 is a plant DNA methyltransferase that methylates mainly cytosines in CHH (H = any base 
but G) contexts, predominantly at transposable elements (TEs) [16,17]. We tested the effect of 
CMT2stop on genome-wide DNA methylation using 131 CMT2wt and 17 CMT2stop accessions, for 
which MethylC-sequencing data were publicly available [7]. A methylome-wide association (MWA) 
analysis between CMT2stop and the methylation-state at ~6 million single methylation 
polymorphisms (SMPs) identified ~3,000 methylome-wide significant CHH associations (Text S1, 
Supplementary Figure 38). Methylation was homogeneous across the clear majority of the CHH-sites 
for the CMT2wt accessions, but not for CG or CHG-sites (Text S1, Supplementary Figure 39). 
Interestingly, the methylation-pattern was more heterogeneous among the CMT2stop accessions, with a 
small overlap both among CMT2stop accessions and between CMT2stop and CMT2wt accessions, 
indicating a shared residual, non-CMT2 mediated CHH methylation. We confirmed that the 
differential methylation detected in the CMT2stop and CMT2wt accessions is consistent with the 
effects of disrupting CMT2, by showing that the level of CHH-methylation across the MWA 
detected sites was significantly lower (P = 2.2 × 10⁻⁴) in four T-DNA insertion cmt2 knock-out 
alleles [16] than in CMT2wt plants (Figure 4A; Text S1, Supplementary Figure 39B). 

cmt2 mutant plants have improved heat-stress tolerance 

To functionally explore whether lack of CMT2 alters the response to temperature-stress, we 
subjected Col-0 and two independent cmt2 mutants to heat-stress. The cmt2 mutants had a 
significantly higher survival-rate (2.1-fold; P = 2.7 × 10⁻³; Figure 4B) than Col-0. This striking 
improvement in tolerance to heat-stress of cmt2 plants suggests CMT2-dependent CHH methylation 
as an important alleviator of stress responses in A. thaliana and a candidate mechanism for 
temperature adaptation. 

DISCUSSION 

A major challenge in attempts to identify individual loci involved in climate adaptation is the strong 
confounding between geographic location, climate and population structure in the natural A. 
thaliana population. Earlier genome-wide association analyses in large collections of natural 
accessions experienced a lack of statistical power when correcting for population-structure [1,8]. 
We used an alternative GWAS approach [10] to test for variance-heterogeneity, instead of a mean 
difference, between genotypes. This analysis identifies loci where the minor allele is more plastic 
(i.e. exists across a broader climatic range) than the major allele. In contrast, a standard GWAS maps 
loci where the allele-frequencies follow the climatic cline. Although plastic alleles might be less 
frequent in the genome, they are easier to detect in these data due to their lower confounding with 
population-structure. This overall increase in power is also apparent when comparing the signals 
that reach a lower, sub-GWAS significance level (Text S1, Supplementary Figure 40-44). In the 
CMT2 locus, which was strongly associated with variable seasonal temperatures, the plastic allele had 
a disrupted genome-wide CHH methylation pattern similar to that of cmt2 mutant plants. 
Interestingly, we could also show that cmt2 mutants had an increased tolerance to heat-stress, a 
finding that both strongly implicates CMT2 as an adaptive locus and clearly illustrates the potential 
of our method as a useful approach to identify novel associations of functional importance. 

It is not clear via which mechanism CMT2-dependent CHH methylation affects plant heat tolerance. 
We consider it most likely that the effect will be mediated by TEs in the immediate neighborhood of 
protein-coding genes. Heterochromatic states at TEs can affect activity of nearby genes and thus 
potentially plant fitness [18]. Consistent with a repressive role of CMT2 on heat stress responses, 
CMT2 expression is reduced by several abiotic stresses including heat [19]. Because global 
depletion of methylation has been shown to enhance resistance to biotic stress [20], it is possible 
that DNA-methylation has a broader function in shaping stress responses than currently thought. 

We identified several alleles associated with a broader range of climates across the native range of 
A. thaliana, suggesting that a genetically mediated plastic response might be important for climate 
adaptation. A defective epigenetic mechanism involving CMT2-mediated CHH-methylation was 
strongly associated with adaptation to variability in seasonal temperatures and cmt2 plants tolerated 
heat-stress better than wild-type plants. Together, these findings suggest genetically determined 
epigenetic variability as a likely mechanism contributing to phenotypic plasticity of adaptive 
advantage in natural environments. 

MATERIALS AND METHODS 

Climate data and genotyped Arabidopsis thaliana accessions 

The climate phenotypes and A. thaliana genotype data that we re-analyzed were obtained from [1]. 
Thirteen climate variables and genotypes of 214,553 single nucleotide polymorphisms (SNPs) for 948 
accessions were available at: http://bergelson.uchicago.edu/regmap-data/climate-genome-scan. The 
climate variables used in the analyses were: aridity, number of consecutive cold days (below 4 
degrees Celsius), number of consecutive frost-free days, day-length in the spring, growing-season 
length, maximum temperature in the warmest month, minimum temperature in the coldest month, 
temperature-seasonality, photosynthetically active radiation, precipitation in the wettest month, 
precipitation in the driest month, precipitation-seasonality, and relative humidity in the spring. No 
squared pairwise Pearson's correlation coefficients between the phenotypes were greater than 0.8 
(Figure S7 of [1]). 

Phenotyping of temperature seasonality in Eurasia for the 1001-genomes 
(http://1001genomes.org) accessions was done by downloading the raw data from 
http://www.worldclim.org/. The data were re-formatted and thereafter processed by the raster package in 
R. The R code is provided in Text S1. 

Genome-wide association analysis to identify adaptability loci in the RegMap collection 

Genome-wide association (GWA) datasets based on natural collections of A. thaliana accessions, 
such as the RegMap collection, are often genetically stratified. This is primarily due to the close 
relationships between accessions sampled at nearby locations. Furthermore, as the climate 
measurements used as phenotypes for the accessions are values representative for the sampling 
locations of the individual accessions, these measurements will be confounded with the general 
genetic relationship [12]. Unless properly controlled for, this confounding might lead to excessive 
false-positive signals in the association analysis, because differences in allele-frequencies 
at loci between locations that differ in climate, and at the same time are geographically distant, will 
create an association between the genotype and the trait. However, this association could also be 
due to other forces than selection. In traditional GWA analyses, mixed-model based approaches are 
commonly used to control for population-stratification. The downside of this approach is that it, in 
practice, will remove many true genetic signals coming from local adaptation due to the inherent 
confounding between local genotype and adaptive phenotype. Instead, the primary signals from 
such analyses will be due to effects of alleles that exist in, and have similar effects across, the entire 
studied population. In general, studying the contributions of genetic variance-heterogeneity to 
the phenotypic variability in complex traits is a novel and useful approach with great potential [21]. 
Here, we have developed and used a new approach that combines a linear mixed model and a 
variance-heterogeneity test, which addresses these initial concerns, and show that it is possible to 
infer statistically robust results of genetically regulated phenotypic variability in GWA data from 
natural populations. 

Statistical modeling in genome-wide scans for adaptability 

The climate data at the geographical origins of the A. thaliana accessions were treated as 
phenotypic responses. Each climate phenotype vector $y$ for all the accessions was normalized via 
an inverse-Gaussian transformation. The squared normalized measurement $z_i = y_i^2$ of accession $i$ is 
modeled by the following linear mixed model to test for an association with climate adaptability 
(i.e. a greater plasticity to the range of the environmental condition): 

$$z_i = \mu + \beta x_i + g_i + e_i$$

where $\mu$ is an intercept, $x_i$ the SNP genotype for accession $i$, $\beta$ the genetic SNP effect, 
$g \sim \mathrm{MVN}(0, G^{*}\sigma_g^2)$ the polygenic effects and $e \sim \mathrm{MVN}(0, I\sigma_e^2)$ the residuals. $x_i$ is coded 0 and 
2 for the two homozygotes (inbred lines). The genomic kinship matrix $G^{*}$ is constructed via the 
whole-genome generalized ridge regression method HEM (heteroscedastic effects model) [11] as 
$G^{*} = ZWZ'$, where $Z$ is a number of individuals by number of SNPs matrix of genotypes 
standardized by the allele frequencies. $W$ is a diagonal matrix with element $w_{jj} = b_j^2/(1 - h_{jj})$ for 
the $j$-th SNP, where $b_j$ is the SNP-BLUP (SNP Best Linear Unbiased Prediction) effect estimate for 
the $j$-th SNP from a whole-genome ridge regression, and $h_{jj}$ is the hat-value for the $j$-th SNP. 
Quantities in $W$ can be directly calculated using the bigRR package [11] in R. An example R 
source code for performing the analysis is provided in Text S1. 
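
The core idea of regressing a squared, normalized phenotype on genotype can be sketched as follows. This is a minimal Python illustration and not the authors' R code: it assumes that the normalization is a rank-based mapping onto standard normal quantiles, it omits the polygenic term $g$ entirely (plain least squares), and all function names and the simulated data are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def inverse_normal_transform(y):
    """Map y onto standard normal quantiles by rank (assumed normalization)."""
    y = np.asarray(y, dtype=float)
    ranks = y.argsort().argsort() + 1            # ranks 1..n
    probs = (ranks - 0.5) / len(y)
    nd = NormalDist()
    return np.array([nd.inv_cdf(p) for p in probs])


def variance_effect(y, x):
    """Least-squares slope of z = y_norm^2 on genotype x (no kinship term)."""
    z = inverse_normal_transform(y) ** 2
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    return float(xc @ (z - z.mean()) / (xc @ xc))


# Simulated example: the minor allele (coded 2) spans a broader climate range.
rng = np.random.default_rng(1)
x = np.repeat([0, 2], [800, 200])
y = np.where(x == 2, rng.normal(0, 2.0, x.size), rng.normal(0, 1.0, x.size))
beta = variance_effect(y, x)
print(f"estimated variance effect: {beta:.3f}")
```

A positive slope indicates that carriers of the allele coded 2 have a larger phenotypic spread, which is the signature the scan looks for; a mean shift alone would not produce it after centering.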

The advantage of using the HEM genomic kinship matrix G*, rather than an ordinary genomic 
kinship matrix G = ZZ', is that HEM is a significant improvement of the ridge regression (SNP- 
BLUP) in terms of the estimation of genetic effects [11,22]. Due to this, the updated genomic 
kinship matrix G* better represents the relatedness between accessions and also accounts for the 
genetic effects of the SNPs on the phenotype. 
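
The relation between the ordinary and the weighted kinship matrix can be sketched in a few lines of Python. The sketch assumes that the per-SNP weights are supplied by the caller (in the text they come from the bigRR fit, which is not reproduced here), uses empirical centering and scaling as a stand-in for standardization by allele frequency, and uses hypothetical function names and random placeholder data.

```python
import numpy as np


def standardize_genotypes(X):
    """Center and scale each SNP column (assumes no monomorphic SNPs)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)


def weighted_kinship(Z, w):
    """G* = Z W Z' with W = diag(w); w holds one non-negative weight per SNP."""
    w = np.asarray(w, dtype=float)
    return (Z * w) @ Z.T                 # scales column j of Z by w[j]


rng = np.random.default_rng(0)
X = 2 * rng.integers(0, 2, size=(30, 40))      # 30 inbred lines, 40 SNPs, coded 0/2
Z = standardize_genotypes(X)

G_plain = weighted_kinship(Z, np.ones(Z.shape[1]))     # reduces to Z Z'
G_star = weighted_kinship(Z, rng.uniform(0.0, 1.0, Z.shape[1]))  # placeholder weights
```

With all weights equal to one the function returns the ordinary kinship $ZZ'$, so the weighted form can be read as letting SNPs with larger estimated effects contribute more to the relatedness between accessions.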

Testing and quality control for association with climate adaptability 
The test statistic for the SNP effect $\beta$ is constructed as the score statistic [23]: 

$$T^2 = \frac{(\tilde{x}' G^{*-1} \tilde{z})^2}{\tilde{x}' G^{*-1} \tilde{x}}$$

implemented in the GenABEL package [24], where $\tilde{x} = x - \mathrm{E}[x]$ are the centered genotypic values 
and $\tilde{z} = z - \mathrm{E}[z]$ the centered phenotypic measurements. The $T^2$ statistic has an asymptotic 
$\chi^2$ distribution with 1 degree of freedom. Subsequent genomic control (GC) [25] of the genome-wide 
association results was performed under the null hypothesis that no SNP has an effect on the climate 
phenotype. SNPs with minor allele frequency (MAF) less than 0.05 were excluded from the 
analysis. A 5% Bonferroni-corrected significance threshold was applied. As suggested by [26], the 
significant SNPs were also analyzed using a Gamma generalized linear model to exclude positive 
findings that might be due to low allele frequencies of the high-variance SNP. 
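
The testing steps described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the GenABEL implementation: the statistic is computed exactly as written in the formula, genomic control is taken to mean dividing by the ratio of the observed median statistic to the median of a 1-df chi-square distribution, and flooring that ratio at 1 is a common convention assumed here. All function names are hypothetical.

```python
import numpy as np

CHI2_1DF_MEDIAN = 0.4549364   # median of a chi-square distribution with 1 df


def score_statistic(x, z, G):
    """T^2 = (x~' G^-1 z~)^2 / (x~' G^-1 x~) for centered x~ and z~."""
    xc = np.asarray(x, dtype=float) - np.mean(x)
    zc = np.asarray(z, dtype=float) - np.mean(z)
    Ginv_x = np.linalg.solve(G, xc)          # G^-1 x~ without forming the inverse
    return float((Ginv_x @ zc) ** 2 / (Ginv_x @ xc))


def genomic_control(stats):
    """Deflate statistics by lambda = median(T^2) / median(chi2 with 1 df)."""
    stats = np.asarray(stats, dtype=float)
    lam = float(np.median(stats)) / CHI2_1DF_MEDIAN
    return stats / max(lam, 1.0), lam        # flooring lambda at 1 is an assumption


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold for a family-wise error rate alpha."""
    return alpha / n_tests


# Toy inputs with an identity kinship matrix.
t2 = score_statistic([0, 2, 0, 2], [1.0, 3.0, 1.0, 3.0], np.eye(4))
adjusted, lam = genomic_control([0.9098728, 0.9098728, 0.9098728])
threshold = bonferroni_threshold(214_553)    # number of SNPs in the RegMap data
print(t2, lam, threshold)
```

For the 214,553 SNPs in the RegMap data, the 5% Bonferroni threshold works out to roughly 2.3 × 10⁻⁷ per test.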

Functional analysis of polymorphisms in loci with significant genome-wide associations to 
climate 

All the loci that showed genome-wide significance in the association study were further 
characterized using the genome sequences of 728 accessions sequenced as part of the 1001- 
genomes project (http://1001genomes.org). Mutations within a ±100 kb interval of each leading 
SNP that are in high linkage disequilibrium (LD) with the leading SNP (r² > 0.8) were 
reported (Table S1). The consequences of the identified polymorphisms were predicted using the 
Ensembl variant effect predictor [27] and their putative effects on the resulting protein estimated 
using the PASE (Prediction of Amino acid Substitution Effects) tool [28]. 
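
The selection of linked variants can be illustrated with a short Python sketch. It assumes that LD is measured as the squared Pearson correlation between 0/2-coded genotype vectors, which is a standard definition for inbred lines but is not spelled out in the text; the function names, positions and genotypes are hypothetical toy values.

```python
import numpy as np


def ld_r2(a, b):
    """Squared Pearson correlation between two genotype vectors."""
    r = np.corrcoef(np.asarray(a, dtype=float), np.asarray(b, dtype=float))[0, 1]
    return float(r * r)


def linked_variants(lead_pos, lead_geno, variants, window=100_000, r2_min=0.8):
    """Positions of variants within +/- window bp of the lead SNP with r^2 > r2_min."""
    return [pos for pos, geno in variants
            if abs(pos - lead_pos) <= window and ld_r2(lead_geno, geno) > r2_min]


# Hypothetical toy data: genotypes coded 0/2 for eight inbred accessions.
lead = [0, 0, 2, 2, 0, 2, 0, 2]
variants = [
    (1_050_000, [0, 0, 2, 2, 0, 2, 0, 2]),   # identical to the lead SNP
    (1_020_000, [0, 2, 0, 2, 0, 2, 0, 2]),   # inside the window, weak LD
    (1_500_000, [0, 0, 2, 2, 0, 2, 0, 2]),   # strong LD, outside the window
]
hits = linked_variants(1_000_000, lead, variants)
print(hits)
```

Only the first variant passes both filters; the second fails the LD cut-off and the third lies outside the ±100 kb interval.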

Methylome-wide association analysis and validation of CMT2stop genotypes 
A methylome-wide association (MWA) analysis was conducted for the CMT2stop genotypes at 
43,182,344 scored single methylation polymorphisms (SMPs) across the genome [7]. The 131 CMT2wt 
and 17 CMT2stop accessions for which MethylC-sequencing data were publicly available at 
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE43857 were included in the analysis. Sites that 
were methylated at a frequency less than 0.05 among the accessions were removed from the 
analysis, resulting in 6,120,869 methylation sites to be tested across the genome. In this set, we 
tested for an association between the CMT2stop genotype and the methylation state at each of the 
6,120,869 SMPs using the qtscore() function in the GenABEL package. 
In total, 3,096 methylome-wide SMPs were significant at a Bonferroni-corrected significance 
threshold for 6,120,869 tests. Divided according to the type of methylated sites, they corresponded 
to 879 CHH, 1162 CG and 731 CHG SMPs. To visualize the pairwise similarity between accessions 
at these sites, we computed an identity-by-methylation-state (IBMS) matrix using all the significant 
CHH sites (Text S1, Supplementary Figure 39A). To validate that the high degree of shared 
methylated sites was a useful predictor for the CMT2stop genotype, we downloaded data from an 
independent experiment [16] that contained methylome data on four CMT2 knockouts and four WT 
samples (GSM1083504, GSM1083505, GSM1083506, GSM1014134, GSM1014135, 
GSM1093622 and GSM1093629 from 
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE41302). In total, the data from [16] contained scores for 718/213/262 of the differentially 
methylated CHH/CHG/CG sites. The level of CHH/CHG/CG methylation was scored in each of the 
eight samples as the sum of the methylation levels across all these CHH/CHG/CG sites. The 
respective methylation-levels for all samples are provided in Text S1, Supplementary Figure 39B/C/D, 
and a t-test shows that the methylation-level was significantly different between lines with 
dysfunctional and WT CMT2 for CHH sites, but not for CHG or CG sites. Together these results 
clearly show that the CMT2stop accessions carry a mutated CMT2 allele. 
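
The filtering and scoring steps of the methylome-wide analysis can be sketched in Python as below. The sketch assumes that methylation states are stored as a binary accessions-by-sites matrix and that per-sample scores are simple row sums; the function names and the toy data are hypothetical, and the association test itself (qtscore() in GenABEL) is not reproduced.

```python
import numpy as np


def filter_rare_sites(M, min_freq=0.05):
    """Keep sites (columns) methylated in at least min_freq of the accessions."""
    M = np.asarray(M)
    keep = M.mean(axis=0) >= min_freq
    return M[:, keep], keep


def summed_methylation(levels):
    """Per-sample sum of methylation levels across sites (rows = samples)."""
    return np.asarray(levels, dtype=float).sum(axis=1)


# Hypothetical toy data: 20 accessions x 3 sites, 1 = methylated.
M = np.zeros((20, 3), dtype=int)
M[:2, 1] = 1        # methylated in 10% of accessions
M[:10, 2] = 1       # methylated in 50% of accessions
kept, keep = filter_rare_sites(M)

threshold = 0.05 / 6_120_869      # Bonferroni threshold for the tested SMPs
totals = summed_methylation([[0.1, 0.0, 0.2], [0.8, 0.9, 0.7]])
print(keep, threshold, totals)
```

With 6,120,869 tested sites the 5% Bonferroni threshold is roughly 8.2 × 10⁻⁹ per test, considerably stricter than the threshold used for the SNP scan.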

Our IBMS results (Text S1, Supplementary Figure 39A) indicated that four of the 17 CMT2stop 
accessions (En-D, Fi-0, Stw-0 & Vind-1) displayed a CHH-methylation pattern across the 
differentially methylated sites that was closer to the CMT2wt phenotype. Interestingly, an evaluation 
of the CMT2 mRNA abundance in these accessions using data from [7] showed that these lines also 
had higher transcript levels than other CMT2stop accessions. Although CMT2stop is a strong 
candidate mutation to cause the mutant CMT2 methylation phenotype, it is not possible to rule out 
that it is only in strong LD with an alternative causative variant or that mechanisms are 
involved that cause the obligatory epialleles across the genome to be reverted by other 
compensatory mechanisms. Regardless, the results clearly show that the differential CHH- 
methylation phenotype is caused by a loss-of-function CMT2 allele. 

Heat-stress treatments on Col-0 and cmt2 knockouts 

Seeds of Col-0, cmt2-5 (SAIL_906_G03) and cmt2-7 (WiscDSLox471G8) were plated on 1/2 MS 
medium (0.8% agar, 1% sucrose), stratified for two days at 4°C in the dark and transferred to a 
growth chamber with 16 h light (110 µmol m⁻² s⁻¹, 22°C) and 8 h dark (20°C) periods. Ten-day-old 
seedlings were transferred to 4°C for one hour and subsequently placed for 24 h at 37.5°C in the 
dark. Plant survival was scored two days after heat stress. 

No difference in survival rate was found between the two knockouts cmt2-5 and cmt2-7. A log- 
linear regression was conducted to test for the difference in survival rate between Col-0 and cmt2 
knockouts, i.e. 

$$\log(s_i / t_i) = \beta_0 + E_i + a_i$$

where $s_i$ is the number of surviving plants of accession $i$, $t_i$ the corresponding total number of 
plants, $E_i$ the experiment effect, $a_i$ the accession effect, and $\beta_0$ an intercept. The model fitting 
procedure was implemented using the glm() procedure in R, with option 
family = gaussian(link = log), $s_i$ as response, $t_i$ as offset, and $\beta_0$, $E_i$, $a_i$ as fixed effects. 
Figure 1. An LD block associated with temperature seasonality contains CMT2. A genome-wide 
significant variance-heterogeneity association signal was identified for temperature seasonality in 
the RegMap collection of natural Arabidopsis thaliana accessions [1]. The peak on chromosome 4 
around 10 Mb (A) mapped to a haplotype block (B) containing a nonsense mutation (CMT2stop) 
early in the first exon of the Chromomethylase 2 (CMT2) gene. Color coding based on |r| (the 
absolute value of the correlation coefficient) as a measure of LD between each SNP in the region 
and the leading SNP in the association analysis. 
Figure 2. Geographic distribution of, and heterogeneous variance for, three CMT2 alleles in two 
collections of A. thaliana accessions. The geographic distributions (A) of the wild-type (CMT2wt; 
gray circles) and two nonsense alleles (CMT2stop/CMT2stop2; filled/open triangles) in the CMT2 
gene illustrate a clustering of CMT2wt alleles in less variable regions and a greater dispersion 
of the nonsense alleles across different climates both in the RegMap [1] (blue) and the 1001- 
genomes [2] (red) A. thaliana collections. The resulting variance-heterogeneity in temperature 
seasonality between genotypes is highly significant, as illustrated by the quantile plots in (B) where 
the median is indicated by a diamond and a bar represents the 25% to 75% quantile range. The 
color scale indicates the level of temperature seasonality across the map. The color key in (A) 
represents the temperature seasonality values, given as the standard deviation in % of the mean 
temperature (K). 
Figure 3. Principal components of the genomic kinship in the RegMap collection for the accessions 
carrying the alternative alleles at the Chromomethylase 2 locus (CMT2stop and CMT2wt as filled 
and empty circles, respectively). Coloring is based on (A) geographical regions (defined as in 
Supplementary Figure 37) and (B) temperature seasonality, ranging from dark blue (least variable) 
to red (most variable). 
Figure 4. Differential CMT2-mediated CHH methylation pattern, and rate of survival under heat- 
stress, of cmt2 mutants. A. The CHH methylation in cmt2 (cmt2-3, cmt2-4, cmt2-5 and cmt2-6) 
plants is significantly lower at the CHH sites mapped in the methylome-wide association analysis 
between the natural accessions carrying the CMT2stop and CMT2wt alleles, implying CMT2stop as 
a natural cmt2 allele. B. The survival rate is significantly higher for cmt2 (cmt2-5 and cmt2-7) than 
for Col-0 plants under heat-stress (24 h at 37.5°C), illustrating the functional role of CMT2 in 
temperature-stress response. P-values in A and B were obtained using a log-linear regression. 

Table 1. Loci with genome-wide significant, non-additive effects on climate adaptation and a functional 
analysis of nearby genes (r² > 0.8) containing missense or nonsense mutations. 
a,b,c,d,e Loci affecting multiple traits; ¹ The predicted functional effect score for the strongest 
missense mutation in the gene based on amino-acid physicochemical properties (PASE) and 
evolutionary conservation (MSA) [28]; ² MAF: Minor Allele Frequency; ³ P-value: significance after 
genomic-control from a linear regression analysis of squared z-scores accounting for population 
stratification; ⁴ Effect: Standardized genetic effect on adaptability (Chi-square distributed) ± 
standard error (unit: phenotypic standard deviations); ⁵ #Mut: number of missense and nonsense mutations 
in the gene in the 1001-genomes dataset [2]; ⁶ Locus contains two missense mutations with equally 
strong predicted effects. 
Other Results 



CMT2 is a potential target for the nonsense-mediated RNA decay (NMD) pathway 

To further explore the potential mechanism underlying the observed effect of CMT2, as well as the 
heterogeneity within the group of mutant accessions, we also studied the level of mRNA in the plants. 
The motivation for this evaluation was that the CMT2stop allele will produce an mRNA with a premature 
translation termination codon, which makes it a likely target for the nonsense-mediated mRNA 
decay (NMD) pathway. Thus, the expectation is that accessions carrying this allele would have lower 
transcript abundance than the wild-type. For this, we used two studies that contained 
both CMT2 genotypes and RNA-seq data for the same lines. First, we analyzed the 
19 genomes project data (ref. 28) that contained full genome sequences and transcriptomes for 19 A. thaliana 
accessions, 2 of which (Ct-1 and Kn-0) are part of the RegMap panel and carry the CMT2stop mutation 
according to their 250k SNP-chip genotypes (ref. 13). Utilizing data from both the biological replicates of 
seedling mRNA, the difference in mRNA abundance was highly significant between mutant and wild-type 
accessions (P = 6.9 × 10⁻⁵), with a higher expression in the wild-type. A similar analysis was done 
using a larger data set from ref. 7, with CMT2 genotypes obtained from SNPs called from whole-genome 
re-sequencing data and RNA-seq data obtained from analyses of leaf tissue in 14 CMT2stop and 92 
CMT2wt accessions. Here, the average mRNA abundance was higher for the wild-type accessions, but 
the difference was not significant in the complete dataset (P = 0.14). However, the mRNA levels were 
significantly higher among the four mutant accessions that displayed a methylation pattern resembling 
that of the wild-type in the analysis above (t-test; P = 0.01) and when those lines were removed from 
the comparison, the levels of mRNA were significantly higher in the wild-type accessions than in the ten 
remaining mutants (t-test; P = 0.02). These results indicate that CMT2 mRNA levels are influenced 
by the genotype and that it is connected to the methylation state in the plant, but provide no conclusive 
evidence on the functional connection between the two. 

VEL1 and adaptation to day length 

A. thaliana is a facultative photoperiodic flowering plant and hence non-inductive photoperiods will delay,
but not abolish, flowering. The genetic control of this phenotypic plasticity is thus an adaptive trait. A
significant association was detected near two genes, VEL1 and XTH19, containing two and one
non-synonymous amino acid substitutions, respectively (Table 1). The major allele was dominant in short-day
regions, whereas the alternative allele was more plastic in relation to day length. XTH19 has been implicated
as a regulator of shade avoidance [29], but information about its potential involvement in the regulation of
photoperiodic length is lacking. VEL1 regulates the epigenetic silencing of genes in the FLC pathway in
response to vernalization [30] and photoperiod length [31], resulting in an acceleration of flowering under
non-inductive photoperiods. A feasible explanation for the existence of an adaptability allele at VEL1 could thus
be that accelerated flowering is beneficial under short-day conditions, but that a lack of accelerated
flowering is also tolerated there. In long-day areas, however, accelerated flowering might be detrimental: day
length follows a latitudinal cline, and in northern areas accelerated flowering while days are still short
could lead to excessive exposure to cold temperatures in early spring and hence lower fitness.

Source Codes 

Example R source code for calculating HEM genomic kinship matrix 

Here, we use the example data in the bigRR package: http://cran.r-project.org/web/packages/bigRR/
to illustrate how an ordinary identity-by-state (IBS) kinship matrix can be updated to a HEM genomic
kinship matrix. The full theoretical details on this procedure are provided in . 

# load the bigRR package
library(bigRR)

# load the example data (provides the phenotype y and the marker matrix Z)
data(Arabidopsis)

# intercept-only design matrix for the fixed effects; standardized markers
X <- matrix(1, length(y), 1)
Z <- scale(Z)

# fitting SNP-BLUP, i.e. a ridge regression on all the markers across the genome
SNP.BLUP <- bigRR(y = y, X = X, Z = Z, family = binomial(link = 'logit'))

# calculate HEM (heteroscedastic effects model) genomic kinship matrix
# marker-specific weights from the SNP-BLUP effects and leverages
w <- as.numeric(SNP.BLUP$u^2 / (1 - SNP.BLUP$leverage))

# scale each marker (row of t(Z)) by sqrt(w), so that G = Z diag(w) Z'
wZt <- sqrt(w) * t(Z)
G <- crossprod(wZt)
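The weighting step above relies on R's recycling rule, which is easy to get wrong. The following minimal sketch uses a small simulated marker matrix and arbitrary positive weights (the names Z_toy and w_toy and all values are hypothetical, not the bigRR example data) to check that the row-wise scaling followed by crossprod() gives the same matrix as the explicit product Z diag(w) Z'.

```r
# Toy check: crossprod(sqrt(w) * t(Z)) equals Z %*% diag(w) %*% t(Z).
# All values are simulated; n, p, Z_toy and w_toy are illustrative names.
set.seed(1)
n <- 5   # individuals
p <- 8   # markers
Z_toy <- scale(matrix(rnorm(n * p), n, p))
w_toy <- runif(p, 0.1, 2)   # stand-in for u^2 / (1 - leverage)

# sqrt(w_toy) has length p and t(Z_toy) has p rows, so recycling
# multiplies row j (marker j) by sqrt(w_toy[j])
wZt_toy <- sqrt(w_toy) * t(Z_toy)
G_fast <- crossprod(wZt_toy)   # n x n kinship-like matrix

# explicit (slower) form for comparison
G_explicit <- Z_toy %*% diag(w_toy) %*% t(Z_toy)

max_abs_diff <- max(abs(G_fast - G_explicit))
```

The crossprod() form avoids building the p x p diagonal matrix, which matters when p is the number of markers across the genome.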

Temperature seasonality phenotype preparation in R 

We present the source code for phenotyping of temperature seasonality in Eurasia. The data
downloaded from http://www.worldclim.org/ were processed using the following code to obtain an object
readable by the raster package: http://cran.r-project.org/web/packages/raster/.

# load the raster package (needed for stack, raster, calc and writeRaster)
library(raster)

# load the original data files, sorted by their numeric suffix
bil_files <- grep(".bil", dir("tmean_30s_bil Folder/"), value = TRUE)
bil_file_order <- as.numeric(sub(pattern = ".+_([0-9]+).bil",
                                 x = bil_files, replacement = "\\1"))
bil_files <- bil_files[order(bil_file_order)]

# create rasters
WorldClim_stack <- stack()
for (bil_file in bil_files) {
    r <- raster(paste("tmean_30s_bil Folder/", bil_file, sep = ""))
    WorldClim_stack <- addLayer(WorldClim_stack, r)
}

# temperature seasonality calculation
r_mean <- calc(WorldClim_stack, mean)
save(r_mean, file = "WorldClim_mean.Rdata")
writeRaster(r_mean, filename = "WorldClim_mean.raster")
r_sd <- calc(WorldClim_stack, sd)
save(r_sd, file = "WorldClim_sd.Rdata")
writeRaster(r_sd, filename = "WorldClim_sd.raster")
# convert the mean from degrees C * 10 to Kelvin
r_mean_corr <- r_mean/10 + 273.15
save(r_mean_corr, file = "WorldClim_mean_corr.Rdata")
# coefficient of variation in percent
r_coeff_var <- 100 * (r_sd/10) / r_mean_corr

# output to a raster object
writeRaster(r_coeff_var, filename = "WorldClim_coeff_var.raster")

# phenotyping at given coordinates
# (LONGITUDE and LATITUDE already loaded)
world_temp_seas <- raster('WorldClim_coeff_var')
# extract the value of the grid cell at each sampling site
temp_seas <- raster::extract(world_temp_seas, cbind(LONGITUDE, LATITUDE))
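To make the arithmetic of the seasonality measure concrete, the sketch below applies the same steps to a single hypothetical grid cell using base R only. The twelve monthly values are invented for illustration and are assumed to be in degrees C multiplied by 10, as implied by the division by 10 in the code above.

```r
# Hypothetical monthly mean temperatures for one grid cell (degrees C * 10).
# These numbers are made up for illustration; they are not WorldClim data.
tmean_x10 <- c(-50, -30, 10, 60, 120, 160, 180, 170, 120, 70, 10, -30)

cell_mean <- mean(tmean_x10)             # annual mean, still in degrees C * 10
cell_sd   <- sd(tmean_x10)               # between-month SD, degrees C * 10

cell_mean_K <- cell_mean / 10 + 273.15   # annual mean in Kelvin
cell_cv     <- 100 * (cell_sd / 10) / cell_mean_K   # seasonality as CV (%)
```

Expressing the mean in Kelvin keeps the denominator positive, so the coefficient of variation remains defined for sites with annual means at or below 0 degrees C.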

Further General Discussions 

Statistical properties of the vGWAS in relation to population stratification 

Inherent properties of the variance heterogeneity test decrease the risk of identifying locally adapted
alleles. An important property of the variance heterogeneity GWAS analysis is that it is inherently more
powerful in detecting loci where the minor allele is associated with a higher variance than the major
allele. In practice, there is no power to detect low-variance minor alleles in a GWAS setting [10, 22]. Hence,
the method facilitates detection of alternative (i.e. minor) alleles associated with a broader range of the
climate variables than the reference (i.e. major) alleles. Such alleles will,
by definition, be present across a large proportion of the global population and will therefore be considerably
less affected by population structure. In Figure 3 or Supplementary Figure 15 (top panel), we illustrate
this property for the inferred loci using the CMT2 locus as an example. The MDS-plot visualizes the
distribution of the CMT2 STOP allele across the population structure present in the RegMap collection
using the pairwise genome-wide relationship between the accessions based on the first two principal
components of the kinship matrix. The link between the geographic origin of the accessions and kinship
is visualized by coloring the dot for each accession according to its geographic origin. As expected, accessions
from nearby regions (e.g. UK, Scandinavia and mainland Europe) are more related. The CMT2 STOP allele
is, however, not heavily confounded with population structure and is present in most major sub-groups
of the population (albeit with a higher frequency in Asia; see Supplementary Fig. 36).
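The idea of testing for variance heterogeneity can be sketched in a few lines of base R. The example below simulates a phenotype for which the minor allele has the same mean but a larger spread than the major allele, and then regresses the squared z-scores on the genotype. It is a simplified illustration on simulated data: the sample size, allele frequency and standard deviations are arbitrary assumptions, and the kinship-based mixed model and genomic control used in the actual analyses are omitted.

```r
# Simulated illustration of a variance-heterogeneity test (not the study data).
set.seed(42)
n   <- 1000
maf <- 0.2

# inbred accessions: genotype coded 0 (major allele) or 1 (minor allele)
g <- rbinom(n, 1, maf)

# identical means, but a broader phenotypic range for the minor allele
y <- rnorm(n, mean = 0, sd = ifelse(g == 1, 2, 1))

# squared z-scores measure each accession's deviation from the overall mean
z  <- (y - mean(y)) / sd(y)
z2 <- z^2

# variance test: does the squared z-score depend on the genotype?
var_fit  <- summary(lm(z2 ~ g))$coefficients
# standard test on the mean, for comparison
mean_fit <- summary(lm(y ~ g))$coefficients

p_var  <- var_fit["g", "Pr(>|t|)"]
p_mean <- mean_fit["g", "Pr(>|t|)"]
```

A positive slope in var_fit indicates that carriers of the minor allele occupy a wider range of the climate variable, which is the type of signal described above, whereas the regression on the mean is not designed to pick it up.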

Mixed-model based vGWAS analyses to account for population structure via modeling of genome-wide
kinship. We deal statistically with the strong correlations that exist between climate and population
structure (e.g. along east-west/north-south clines) using a mixed-model based approach accounting for
genomic kinship combined with genomic control. This approach has earlier been shown to control type I
errors (i.e. genome-wide P-value inflation) in structured populations. The major challenge in analyzing
this population is thus not the false-positive rate, as standard GWAS analyses can also be implemented in
the same mixed-model framework, but rather to avoid unacceptably high type II errors (i.e. low power)
for traits confounded with population structure. Traditional GWAS analyses model alleles as having a
linear relationship with climate, which in practice means that they mostly coincide with the population
structure along geographic clines. Hence, analyses will either be prone to identify false associations
(when population structure is not accounted for) or be under-powered (when accounting for population
structure). Although this is not explicitly discussed in the earlier reports based on these data, it is the
primary reason for their lack of genome-wide significant associations to individual adaptive loci. As
illustrated in Figure 3, the variance-heterogeneity test identifies loci present across population strata,
for which the signal therefore remains even after accounting for population structure via the mixed-model
approach. The independence between the effect of the inferred locus and population structure can be
evaluated statistically by fitting a linear mixed model where the genotype is regressed on the genomic
kinship; the estimated heritability differs from 0 when confounding is present. For CMT2 this estimate is
zero, showing that the CMT2 genotype is not confounded with population structure in these data.
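The genomic-control step mentioned above can be illustrated as follows. This is a minimal sketch using a common median-based definition of the inflation factor, applied to simulated test statistics; the number of markers and the inflation of 1.3 are arbitrary assumptions, and the authors' implementation may differ in detail.

```r
# Simulated illustration of genomic control (not the study data).
set.seed(7)
m <- 5000   # hypothetical number of markers

# 1-df chi-square test statistics with a uniform inflation (arbitrary factor)
chisq_raw <- 1.3 * rchisq(m, df = 1)

# inflation factor: observed median relative to the expected median under the null
lambda <- median(chisq_raw) / qchisq(0.5, df = 1)

# genomic control: rescale the statistics and recompute the p-values
chisq_gc <- chisq_raw / lambda
p_gc     <- pchisq(chisq_gc, df = 1, lower.tail = FALSE)

# after correction the inflation factor is 1 by construction
lambda_after <- median(chisq_gc) / qchisq(0.5, df = 1)

# Bonferroni-corrected genome-wide threshold for m tests at the 5% level
bonferroni_threshold <- 0.05 / m
```

In practice the rescaling is usually applied only when lambda exceeds 1, so that the test statistics are never inflated by the correction.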

On the power of vGWAS and GWAS analyses in highly structured populations. There are several
reasons why a low overlap is expected between the results from traditional GWAS/selective-sweep
analyses (as performed earlier) and the variance-heterogeneity GWAS (vGWAS) used here. First, in the
absence of population stratification, the GWAS is more powerful than the vGWAS. In the presence of
population stratification, however, loci affecting the mean phenotype will often be highly confounded
with population structure as they are a main genetic mechanism leading to local adaptation. In order
to infer such loci when controlling for population structure, the same alleles need to have been under
selection in multiple, unrelated populations, which is apparently a rare event as no such loci could be
detected in the earlier studies of climate adaptation. The population-genetic forces acting on
variance-controlling loci are still poorly explored. Studies have, however, shown that high- and low-variance
alleles are likely to co-exist in the population over extended periods of time, at frequencies balanced
by fluctuations in the surrounding environment to which the population adapts [9]. Due to this,
both alleles are more likely to be present across different population strata than mean-affecting alleles
and are therefore less confounded with population structure. In Supplementary Figure 36, we exemplify
this by visualizing the allele frequency of the CMT2 STOP allele across the major sub-populations in the
RegMap and 1001-genomes data. Although the uneven sampling across the regions in the 1001-genomes data
makes allele-frequency estimates uncertain, the overall picture shows that the minor allele is present across
all sub-populations at a lower frequency and that it has increased in frequency in Asia.

Second, a traditional GWAS searches for differences in means between genotypes, whereas the vGWAS
searches for differences in variances between genotypes. As these are two different statistical
properties of the phenotypic distribution, the basic assumption is that they are both statistically and
biologically unrelated, and consequently the loci identified by the two methods are not expected to overlap.
Although some degree of overlap might be expected, e.g. in situations where the variance scales with the
mean, the stringency required to reach genome-wide significance means that, in practice, only
loci with strong, pure effects on one of the statistical moments seem able to reach such significance
levels (see e.g. [10]). Formal comparisons between the results of the association studies will thus be
misleading to readers, as they will suggest that an overlap is to be expected. Evaluations
of the overlap for sub-GWAS signals, performed to explore the potential overlap of loci with weaker effects on both
the mean and the variance, show some overlap (Supplementary Fig. 40-44). It should be noted,
however, that comparisons of overlap at individual loci are not appropriate at these significance levels due to
the lack of proper control of the type I error rate. The overall conclusions from these comparisons are i)
that the power is generally very low for the GWAS after control for population stratification and ii) that
even the sub-GWAS overlap is low for the two methods, but the overlap that exists is consistent with the
correlation between the climate variables.

Third, the earlier studies have also inferred loci using traditional selective-sweep mapping. These
analyses are designed to infer hard selective sweeps where (potentially) adaptive alleles are assumed to
have increased in frequency due to directional selection. As discussed above, the population-genomic
dynamics of plastic alleles do not follow the same pattern as those of alleles affecting the mean (see e.g. [9]),
leading to a co-existence of the alleles over prolonged periods of time. This means that they will not be
surrounded by a traditional genomic footprint of directional selection that can be detected in a
selective-sweep analysis, and one would not expect any overlap between the loci inferred in the selective-sweep
and vGWAS based analyses.

a: Phenotypic and p-value distributions.

Top-left: phenotypic distribution; Top-right: -log10 p-values after genomic control (GC) against minor allele
frequencies (MAF); Bottom panels: Quantile-quantile plots of p-values and -log10 p-values before (blue) and after
(green) GC.

b: Genome-wide association mapping for climate adaptability.

The plotted -log10 p-values are genomic controlled. Markers with minor allele frequencies less than 5% are removed.
Chromosomes are distinguished by colors. The Bonferroni-corrected significance threshold is marked by the
horizontal line.

a: Phenotypic and p-value distributions.

Top-left: phenotypic distribution; Top-right: -log10 p-values after genomic control (GC) against minor allele
frequencies (MAF); Bottom panels: Quantile-quantile plots of p-values and -log10 p-values before (blue) and after
(green) GC.

b: Genome-wide association mapping for climate adaptability.

The plotted -log10 p-values are genomic controlled. Markers with minor allele frequencies less than 5% are removed.
Chromosomes are distinguished by colors. The Bonferroni-corrected significance threshold is marked by the
horizontal line.

Supplementary Figure 2 - Summary of results for maximum temperature in the warmest month.

a: Phenotypic and p-value distributions.

Top-left: phenotypic distribution; Top-right: -log10 p-values after genomic control (GC) against minor allele
frequencies (MAF); Bottom panels: Quantile-quantile plots of p-values and -log10 p-values before (blue) and after
(green) GC.

b: Genome-wide association mapping for climate adaptability.

The plotted -log10 p-values are genomic controlled. Markers with minor allele frequencies less than 5% are removed.
Chromosomes are distinguished by colors. The Bonferroni-corrected significance threshold is marked by the
horizontal line.

Supplementary Figure 3 - Summary of results for minimum temperature in the coldest month.

a: Phenotypic and p-value distributions.

b: Genome-wide association mapping for climate adaptability.

The plotted -log10 p-values are genomic controlled. Markers with minor allele frequencies less than 5% are removed.
Chromosomes are distinguished by colors. The Bonferroni-corrected significance threshold is marked by the
horizontal line.

a: Phenotypic and p-value distributions.

Top-left: phenotypic distribution; Top-right: -log10 p-values after genomic control (GC) against minor allele
frequencies (MAF); Bottom panels: Quantile-quantile plots of p-values and -log10 p-values before (blue) and after
(green) GC.

b: Genome-wide association mapping for climate adaptability.

The plotted -log10 p-values are genomic controlled. Markers with minor allele frequencies less than 5% are removed.
Chromosomes are distinguished by colors. The Bonferroni-corrected significance threshold is marked by the
horizontal line.

Supplementary Figure 5 - Summary of results for precipitation in the driest month.

a: Phenotypic and p-value distributions.

b: Genome-wide association mapping for climate adaptability.

The plotted -log10 p-values are genomic controlled. Markers with minor allele frequencies less than 5% are removed.
Chromosomes are distinguished by colors. The Bonferroni-corrected significance threshold is marked by the
horizontal line.

Supplementary Figure 6 - Summary of results for precipitation CV.

a: Phenotypic and p-value distributions.

Top-left: phenotypic distribution; Top-right: -log10 p-values after genomic control (GC) against minor allele
frequencies (MAF); Bottom panels: Quantile-quantile plots of p-values and -log10 p-values before (blue) and after
(green) GC.

b: Genome-wide association mapping for climate adaptability.

The plotted -log10 p-values are genomic controlled. Markers with minor allele frequencies less than 5% are removed.
Chromosomes are distinguished by colors. The Bonferroni-corrected significance threshold is marked by the
horizontal line.

Supplementary Figure 7 - Summary of results for photosynthetically active radiation in spring.

a: Phenotypic and p-value distributions.

Top-left: phenotypic distribution; Top-right: -log10 p-values after genomic control (GC) against minor allele
frequencies (MAF); Bottom panels: Quantile-quantile plots of p-values and -log10 p-values before (blue) and after
(green) GC.

b: Genome-wide association mapping for climate adaptability.

The plotted -log10 p-values are genomic controlled. Markers with minor allele frequencies less than 5% are removed.
Chromosomes are distinguished by colors. The Bonferroni-corrected significance threshold is marked by the
horizontal line.

Supplementary Figure 8 - Summary of results for length of the growing season.

a: Phenotypic and p-value distributions.

Top-left: phenotypic distribution; Top-right: -log10 p-values after genomic control (GC) against minor allele
frequencies (MAF); Bottom panels: Quantile-quantile plots of p-values and -log10 p-values before (blue) and after
(green) GC.

b: Genome-wide association mapping for climate adaptability.

The plotted -log10 p-values are genomic controlled. Markers with minor allele frequencies less than 5% are removed.
Chromosomes are distinguished by colors. The Bonferroni-corrected significance threshold is marked by the
horizontal line.

Supplementary Figure 9 - Summary of results for number of consecutive cold days.

a: Phenotypic and p-value distributions.

Top-left: phenotypic distribution; Top-right: -log10 p-values after genomic control (GC) against minor allele
frequencies (MAF); Bottom panels: Quantile-quantile plots of p-values and -log10 p-values before (blue) and after
(green) GC.

b: Genome-wide association mapping for climate adaptability.

The plotted -log10 p-values are genomic controlled. Markers with minor allele frequencies less than 5% are removed.
Chromosomes are distinguished by colors. The Bonferroni-corrected significance threshold is marked by the
horizontal line.

Supplementary Figure 10 - Summary of results for number of consecutive frost-free days.

a: Phenotypic and p-value distributions.

Top-left: phenotypic distribution; Top-right: -log10 p-values after genomic control (GC) against minor allele
frequencies (MAF); Bottom panels: Quantile-quantile plots of p-values and -log10 p-values before (blue) and after
(green) GC.

b: Genome-wide association mapping for climate adaptability.

The plotted -log10 p-values are genomic controlled. Markers with minor allele frequencies less than 5% are removed.
Chromosomes are distinguished by colors. The Bonferroni-corrected significance threshold is marked by the
horizontal line.

Supplementary Figure 11 - Summary of results for relative humidity in spring.

a: Phenotypic and p-value distributions.

Top-left: phenotypic distribution; Top-right: -log10 p-values after genomic control (GC) against minor allele
frequencies (MAF); Bottom panels: Quantile-quantile plots of p-values and -log10 p-values before (blue) and after
(green) GC.

b: Genome-wide association mapping for climate adaptability.

The plotted -log10 p-values are genomic controlled. Markers with minor allele frequencies less than 5% are removed.
Chromosomes are distinguished by colors. The Bonferroni-corrected significance threshold is marked by the
horizontal line.

Supplementary Figure 12 - Summary of results for day length in spring.

a: Phenotypic and p-value distributions.

b: Genome-wide association mapping for climate adaptability.

The plotted -log10 p-values are genomic controlled. Markers with minor allele frequencies less than 5% are removed.
Chromosomes are distinguished by colors. The Bonferroni-corrected significance threshold is marked by the
horizontal line.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 14 - Principal components of the genomic kinship for the two alleles on chromosome 2 at
12169701 bp. Corresponding climate variable: temperature seasonality.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 15 - Principal components of the genomic kinship for the two alleles on chromosome 4 at
10406018 bp. Corresponding climate variable: temperature seasonality.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 16 - Principal components of the genomic kinship for the two alleles on chromosome 1 at
6936457 bp. Corresponding climate variable: maximum temperature in the warmest month.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 17 - Principal components of the genomic kinship for the two alleles on chromosome 2 at
18620697 bp. Corresponding climate variable: minimum temperature in the coldest month.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 18 - Principal components of the genomic kinship for the two alleles on chromosome 2 at
19397389 bp. Corresponding climate variable: minimum temperature in the coldest month.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 19 - Principal components of the genomic kinship for the two alleles on chromosome 5 at
14067526 bp. Corresponding climate variable: minimum temperature in the coldest month.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 20 - Principal components of the genomic kinship for the two alleles on chromosome 5 at
18397418 bp. Corresponding climate variable: minimum temperature in the coldest month.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 21 - Principal components of the genomic kinship for the two alleles on chromosome 2 at
18620697 bp. Corresponding climate variable: number of consecutive cold days.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 22 - Principal components of the genomic kinship for the two alleles on chromosome 2 at
19397389 bp. Corresponding climate variable: number of consecutive cold days.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 23 - Principal components of the genomic kinship for the two alleles on chromosome 5 at
7492277 bp. Corresponding climate variable: number of consecutive cold days.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 24 - Principal components of the genomic kinship for the two alleles on chromosome 5 at
18397418 bp. Corresponding climate variable: number of consecutive cold days.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 25 - Principal components of the genomic kinship for the two alleles on chromosome 2 at
12169701 bp. Corresponding climate variable: day length in spring.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 26 - Principal components of the genomic kinship for the two alleles on chromosome 3 at
12642006 bp. Corresponding climate variable: day length in spring.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 27 - Principal components of the genomic kinship for the two alleles on chromosome 4 at
14788320 bp. Corresponding climate variable: day length in spring.

Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 28 - Principal components of the genomic kinship for the two alleles on chromosome 3 at
1816353 bp. Corresponding climate variable: relative humidity in spring.






Genomic kinship principal components categorized based on geographical regions.

Genomic kinship principal components colored based on the scale of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 29 - Principal components of the genomic kinship for the two alleles on chromosome 4 at
14834441 bp. Corresponding climate variable: relative humidity in spring.
Genomic kinship principal components categorized by geographical region.

Genomic kinship principal components colored according to the value of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 30 - Principal components of the genomic kinship for the two alleles on chromosome 5 at
8380640 bp. Corresponding climate variable: relative humidity in spring.
Genomic kinship principal components categorized by geographical region.

Genomic kinship principal components colored according to the value of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 31 - Principal components of the genomic kinship for the two alleles on chromosome 3 at
576148 bp. Corresponding climate variable: length of the growing season.
Genomic kinship principal components categorized by geographical region.

Genomic kinship principal components colored according to the value of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 32 - Principal components of the genomic kinship for the two alleles on chromosome 1 at
953031 bp. Corresponding climate variable: number of consecutive frost-free days.
Genomic kinship principal components categorized by geographical region.

Genomic kinship principal components colored according to the value of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 33 - Principal components of the genomic kinship for the two alleles on chromosome 1 at
6463065 bp. Corresponding climate variable: number of consecutive frost-free days.
Genomic kinship principal components categorized by geographical region.

Genomic kinship principal components colored according to the value of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 34 - Principal components of the genomic kinship for the two alleles on chromosome 2 at
9904076 bp. Corresponding climate variable: number of consecutive frost-free days.
Genomic kinship principal components categorized by geographical region.

Genomic kinship principal components colored according to the value of the climate variable.

The colors scale from pure blue (the minimum climate variable value) to pure red (the maximum value).

Supplementary Figure 35 - Principal components of the genomic kinship for the two alleles on chromosome 5 at
18061531 bp. Corresponding climate variable: number of consecutive frost-free days.
Supplementary Figure 36: Comparison between the RegMap and 1001 Genomes collections
in terms of the allele frequency of CMT2stop across different geographic regions in the
Eurasian A. thaliana population. The numbers in the bars are the numbers of CMT2stop alleles
in each region.
Supplementary Figure 38: Methylome-wide association tests for CMT2stop genotypes.
Methylation sites passing the Bonferroni-corrected significance threshold are marked
in red; these sites were used in the validation analysis in the CMT2 knockouts.
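
A Bonferroni-corrected threshold divides the nominal significance level by the number of tests performed. The sketch below illustrates the arithmetic only; the nominal level of 0.05, the function name `bonferroni_hits`, and the example p-values are assumptions, not values from the analysis.

```python
def bonferroni_hits(pvalues, alpha=0.05):
    """Return the indices of tests whose p-value passes the Bonferroni threshold.

    alpha is the nominal significance level (0.05 is an assumed default).
    """
    if not pvalues:
        return []
    threshold = alpha / len(pvalues)  # one shared threshold for all tests
    return [i for i, p in enumerate(pvalues) if p < threshold]


# Four hypothetical methylation sites: the threshold is 0.05 / 4 = 0.0125.
site_pvalues = [0.001, 0.02, 0.012, 0.5]
print(bonferroni_hits(site_pvalues))
```

In a methylome-wide scan the number of tests is the number of methylation sites tested, so the threshold becomes correspondingly stricter.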
Supplementary Figure 39: Genome-wide methylation patterns for different CMT2
genotypes. The genome-wide CHH-methylation pattern is similar across the 131 A. thaliana
CMT2wt accessions (A), but not among the 17 accessions carrying the loss-of-function (CMT2stop)
allele or between the CMT2wt and CMT2stop accessions. The most divergently CHH-methylated
sites between natural CMT2wt and CMT2stop accessions are also differentially methylated
between CMT2wt and cmt2 plants, illustrated here by the degree of methylation sharing at these
sites among four CMT2wt and four cmt2 mutants (cmt2-3, cmt2-4, cmt2-5, cmt2-6)
{Zemach:2013dj} (B). No such differential methylation was found for either CHG (C) or CG (D)
sites. The color key represents identity-by-methylation-state (IBMS) values in (A), and correlation
coefficients of methylation scores in (B, C, D).
Supplementary Figure 40 - Comparison between the correlations among the climate variables (upper triangle) and
the overlap in variance-heterogeneity GWA profiles (lower triangle). Numbers shown in the figure are percentages.
Pearson's correlation coefficients were calculated for each pair of the climate variables. Overlaps in GWA profiles
were calculated as the proportion of shared SNPs above the threshold of 1.0 x 10^-4.
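
The two quantities compared in this kind of heat map can be sketched as follows. This is a minimal illustration, not the authors' code. The function names and toy values are hypothetical. "Above the threshold" is read here as a p-value below 1.0 x 10^-4, and the denominator of the overlap (SNPs passing the threshold in either profile) is an assumption, since the caption states only "proportion of shared SNPs".

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def gwa_overlap(pvals_a, pvals_b, threshold=1.0e-4):
    """Proportion of shared SNPs among those passing the threshold.

    pvals_a, pvals_b: dicts mapping a SNP identifier to its p-value for two
    climate variables. Dividing by the SNPs that pass in either profile is
    an assumption of this sketch.
    """
    hits_a = {snp for snp, p in pvals_a.items() if p < threshold}
    hits_b = {snp for snp, p in pvals_b.items() if p < threshold}
    union = hits_a | hits_b
    return len(hits_a & hits_b) / len(union) if union else 0.0


# Toy values, reported as rounded percentages as in the figure.
seasonality = [1.0, 2.0, 3.0, 4.0]
day_length = [2.0, 4.0, 6.0, 8.5]
profile_a = {"snp1": 1e-6, "snp2": 5e-5, "snp3": 0.2}
profile_b = {"snp1": 3e-5, "snp2": 0.01, "snp3": 0.5}
print(round(100 * pearson_r(seasonality, day_length)))
print(round(100 * gwa_overlap(profile_a, profile_b)))
```

Putting both quantities on a percentage scale lets the two triangles of the matrix share one color key.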






[Heat map of pairwise percentages among the 13 climate variables: aridity index, day length in spring, relative humidity in spring, number of consecutive frost-free days, number of consecutive cold days, length of the growing season, photosynthetically active radiation in spring, precipitation CV, precipitation in the driest month, precipitation in the wettest month, minimum temperature in the coldest month, maximum temperature in the warmest month, temperature seasonality.]



Supplementary Figure 41 - Comparison between the correlations among the residual climate variables after genomic 
kinship correction (upper triangle) and the overlap in variance-heterogeneity GWA profiles (lower triangle). Numbers 
shown in the figure are percentages. Pearson's correlation coefficients were calculated for each pair of the climate 
variables. Overlaps in GWA profiles were calculated as the proportion of shared SNPs above the threshold of 1.0 x 10^-4.
Supplementary Figure 42 - Comparison between the correlations among the residual climate variables after genomic
kinship correction (upper triangle) and the correlations among the original climate variables (lower triangle). Numbers 
shown in the figure are percentages. Pearson's correlation coefficients were calculated for each pair of the climate 
variables. 
Supplementary Figure 43 - Comparison between the correlations among the climate variables (upper triangle) and the
overlap in ordinary GWA profiles (lower triangle). Numbers shown in the figure are percentages. Pearson's correlation
coefficients were calculated for each pair of the climate variables. Overlaps in GWA profiles were calculated as the
proportion of shared SNPs above the threshold of 1.0 x 10^-4.






[Heat map of pairwise percentages among the 13 climate variables: aridity index, day length in spring, relative humidity in spring, number of consecutive frost-free days, number of consecutive cold days, length of the growing season, photosynthetically active radiation in spring, precipitation CV, precipitation in the driest month, precipitation in the wettest month, minimum temperature in the coldest month, maximum temperature in the warmest month, temperature seasonality.]



Supplementary Figure 44 - Comparison between the correlations among the climate variables (upper triangle) and 
the overlap in simple GWA profiles without correction for population structure (lower triangle). Numbers shown in 
the figure are percentages. Pearson's correlation coefficients were calculated for each pair of the climate variables. 
Overlaps in GWA profiles were calculated as the proportion of shared SNPs above the threshold of 1.0 x 10^-4.
Table S1: Missense and nonsense mutations in high LD with genome-wide significant, non-additive associations with climate adaptability.
Table S2: Genes located less than 100 kb upstream or downstream of the leading SNP in the
genome-wide association analysis that are also in high linkage disequilibrium with the SNP (r^2 > 0.8).

Trait Gene

Temperature seasonality
AT4G18960 AG K-box region and MADS-box transcription factor family protein
AT4G18970 GDSL-like Lipase/Acylhydrolase superfamily protein
AT4G18975 Pentatricopeptide repeat (PPR) superfamily protein
AT4G18980 AtS40-3
AT4G18990 XTH29 xyloglucan endotransglucosylase/hydrolase 29
AT4G19003 VPS25 E2F/DP family winged-helix DNA-binding domain
AT4G19030 NLM1 NOD26-like major intrinsic protein 1
AT4G19035 LCR7 low-molecular-weight cysteine-rich 7
AT4G19038 LCR15 low-molecular-weight cysteine-rich 15
AT4G19040 EDR2 ENHANCED DISEASE RESISTANCE 2
AT4G19045 Mob1/phocein family protein
AT4G19050 NB-ARC domain-containing disease resistance protein
AT4G19080 unknown protein
AT4G19095 unknown protein
AT4G19100 unknown protein
AT4G19112 CPuORF25 conserved peptide upstream open reading frame 25
AT4G19120 ERD3 S-adenosyl-L-methionine-dependent methyltransferases superfamily protein

Maximum temperature in the warmest month
AT1G19970 ER lumen protein retaining receptor family protein
AT1G19980 cytomatrix protein-related
AT1G20000 TAF11b TBP-associated factor 11B
AT1G20010 TUB5 tubulin beta-5 chain
AT1G20015 snoRNA

Minimum temperature in the coldest month
AT5G35926 Protein with RNI-like/FBD-like domains

Number of consecutive cold days
AT5G22555 unknown protein
AT5G22570 WRKY38 WRKY DNA-binding protein 38

Day length in spring
AT3G30859 transposable element gene
AT3G30867 pseudogene, putative SNF8 protein homolog

Relative humidity in spring & Day length in spring
AT4G30240 Syntaxin/t-SNARE family protein
AT4G30250 P-loop containing nucleoside triphosphate hydrolases superfamily protein
AT4G30260 Integral membrane Yip1 family protein
AT4G30270 MERI5B xyloglucan endotransglucosylase/hydrolase 24
AT4G30280 ATXTH18 xyloglucan endotransglucosylase/hydrolase 18
AT4G30300 ATNAP15 non-intrinsic ABC protein 15
AT4G30320 (Cysteine-rich secretory proteins, Antigen 5, and Pathogenesis-related 1 protein) superfamily protein
AT4G30330 Small nuclear ribonucleoprotein family protein
AT4G30340 ATDGK7 diacylglycerol kinase 7

Minimum temperature in the coldest month & Number of consecutive cold days
AT2G47250 RNA helicase family protein
AT2G45150 CDS4 cytidinediphosphate diacylglycerol synthase 4
AT2G45160 HAM1 GRAS family transcription factor
AT2G45161 unknown protein
AT2G45170 ATATG8E AUTOPHAGY 8E
AT5G45380 ATDUR3 solute:sodium symporters;urea transmembrane transporters
AT5G45390 CLPP4 CLP protease P4
AT5G45400 RPA70C Replication factor-A protein 1-related
AT5G45410 unknown protein

Temperature seasonality & Day length in spring
AT2G28410 unknown protein
AT2G28420 Lactoylglutathione lyase/glyoxalase I family protein
AT2G28426 unknown protein
AT2G28430 unknown protein
AT2G28440 proline-rich family protein
AT2G28450 zinc finger (CCCH-type) family protein
AT2G28460 Cysteine/Histidine-rich C1 domain family protein

Relative humidity in spring
AT3G06019 unknown protein
AT3G06020 unknown protein
AT3G06030 ANP3 NPK1-related protein kinase 3
AT5G24530 DMR6 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily protein
AT5G24540 BGLU31 beta glucosidase 31

Length of the growing season
AT3G02660 Tyrosyl-tRNA synthetase, class Ib, bacterial/mitochondrial
AT3G02670 Glycine-rich protein family
AT3G02680 NBS1 nijmegen breakage syndrome 1
AT3G02690 nodulin MtN21/EamA-like transporter family protein

Number of consecutive frost-free days
AT1G03780 TPX2 targeting protein for XKLP2
AT1G03800 ERF10 ERF domain protein 10
AT1G03810 Nucleic acid-binding, OB-fold-like protein
AT1G03820 unknown protein
AT1G03830 guanylate-binding family protein
AT1G18720 unknown protein
AT1G18730 NDF6 NDH dependent flow 6
AT1G18735 other RNA
AT1G18740 unknown protein
AT1G18745 NcRNA
AT1G18750 AGL65 AGAMOUS-like 65
AT2G23250 UGT84B2 UDP-glucosyl transferase 84B2
AT2G23260 UGT84B1 UDP-glucosyl transferase 84B1
AT2G23270 unknown protein
AT2G23290 AtMYB70 myb domain protein 70
AT5G44740 POLH
AT5G44750 REV1
AT5G44760 C2 domain-containing protein
AT5G44770 Cysteine/Histidine-rich C1 domain family protein
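
The selection rule stated in the caption of Table S2 (genes less than 100 kb from the leading SNP that are also in high LD with it, r^2 > 0.8) can be sketched as below. This is an illustration under stated assumptions rather than the authors' pipeline. Genotypes are coded 0/1 per accession, as for inbred lines. r^2 is taken as the squared Pearson correlation between two genotype vectors. Each gene is represented by a single SNP inside it. All names and values (`ld_r2`, `select_genes`, positions, gene labels) are hypothetical.

```python
def ld_r2(g1, g2):
    """Squared Pearson correlation between two 0/1 genotype vectors."""
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2)) / n
    var1 = sum((a - m1) ** 2 for a in g1) / n
    var2 = sum((b - m2) ** 2 for b in g2) / n
    if var1 == 0 or var2 == 0:
        return 0.0  # monomorphic marker: LD is undefined, treated as 0 here
    return cov * cov / (var1 * var2)


def select_genes(lead_pos, lead_geno, genes, max_dist=100_000, min_r2=0.8):
    """Return names of genes within max_dist bp of the lead SNP and in high LD.

    genes: list of (name, position, genotype_vector) tuples, where each gene
    is represented by one SNP inside it (an assumption of this sketch).
    """
    return [
        name
        for name, pos, geno in genes
        if abs(pos - lead_pos) < max_dist and ld_r2(lead_geno, geno) > min_r2
    ]


# Hypothetical lead SNP at 1,000,000 bp genotyped in eight accessions.
lead = [0, 0, 1, 1, 0, 1, 1, 0]
genes = [
    ("geneA", 1_050_000, [0, 0, 1, 1, 0, 1, 1, 0]),  # close, perfect LD
    ("geneB", 1_020_000, [0, 1, 0, 1, 0, 1, 0, 1]),  # close, no LD
    ("geneC", 1_300_000, [0, 0, 1, 1, 0, 1, 1, 0]),  # perfect LD, too far
]
print(select_genes(1_000_000, lead, genes))
```

Both conditions must hold, so a gene in perfect LD but beyond 100 kb is excluded, as is a nearby gene in low LD.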



